Abstract

In this paper we study von Neumann un-biasing normalisation for ideal and real quantum random number generators, operating on finite strings or infinite bit sequences. In the ideal cases one can obtain the desired un-biasing. This relies critically on the independence of the source, a notion we rigorously define for our model. In real cases, affected by imperfections in measurement and hardware, one cannot achieve a true un-biasing, but, if the bias "drifts sufficiently slowly", the result can be arbitrarily close to un-biasing. For infinite sequences, normalisation can either increase or decrease the (algorithmic) randomness of the generated sequences.

A successful application of von Neumann normalisation — in fact, any un-biasing 
transformation — does exactly what it promises, un-biasing, one (among infinitely many) 
symptoms of randomness; it will not produce "true" randomness. 

1 Introduction 

The outcome of some individual quantum-mechanical events cannot in principle be predicted, so they are thought of as ideal sources of random numbers. An incomplete list of quantum phenomena used for random number generation includes nuclear decay radiation sources [2^, the quantum mechanical noise in electronic circuits known as shot noise [27], and photons travelling through a semi-transparent mirror [2], [25], [29], [30], [32]. Our methods are primarily developed to address these latter photon-based quantum random number generators (QRNGs), one of the most direct and popular ways to generate QRNs, but many of our mathematical results will be applicable to other QRNGs.

Due to imperfections in measurement and hardware, the flow of bits generated by a QRNG contains bias and correlation, two symptoms of non-randomness [8]. The first and simplest technique for reducing bias was invented by von Neumann [M]. It considers pairs of bits, and takes one of three actions: a) pairs of equal bits are discarded; b) the pair 01 becomes 0; c) the pair 10 becomes 1. Contrary to widespread claims, the technique works for some sources of bits, but not for all. The output produced by an independent source of constantly biased bits is transformed (after significantly reducing the number of bits produced) into a flow of bits in which the frequencies of 0's and 1's are equal: 50% for each. As we shall show, a stronger property is true: the un-biasing works not only for bits but for all reasonably long bit-strings. However, if the bias is not constant the procedure does not work. Finally, the von Neumann procedure cannot assure "true randomness" in its output.
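The three actions described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `von_neumann` and the choice to ignore a trailing unpaired bit are assumptions made here, not part of the original text.

```python
from typing import Iterable, List


def von_neumann(bits: Iterable[int]) -> List[int]:
    """Von Neumann un-biasing: 01 -> 0, 10 -> 1, pairs 00 and 11 are discarded."""
    bits = list(bits)
    out = []
    # Walk over non-overlapping pairs; a trailing unpaired bit is ignored.
    for i in range(0, len(bits) - 1, 2):
        first, second = bits[i], bits[i + 1]
        if first != second:
            # For an unequal pair the output is simply the first bit.
            out.append(first)
    return out
```

For instance, the input pairs 01, 10, 00, 11, 10 yield the three output bits 0, 1, 1, which illustrates how strongly the procedure can reduce the number of bits.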

To understand the behaviour of QRNGs we need to study the un-biasing transformations on both (finite) strings and (infinite) sequences of bits produced by the source. In this paper we will focus on von Neumann normalisation^1 because it is very simple, easy to implement, and (along with the more efficient iterated version due to Peres [24], for which the results will also apply) is widely used by current proposals for QRNGs [HI [221 IHl ES].

The main results of this paper are the following. In the "ideal case", the von Neumann normalised output of an independent constantly biased QRNG is the probability space of the uniform distribution (un-biasing). This result is true both for finite strings and for the infinite sequences produced by QRNGs (the QRNG runs indefinitely in the latter case).

It is important to note that independence in the mathematical sense of multiplicativity of
probabilities is a model intended to correspond to the physical notion of independence of 
outcomes [18] . In order to study the theoretical behaviour of QRNGs, which are based on the 
assumption of physical independence of measurements, we must translate this appropriately 
into our formal model. We carefully define independence of QRNGs to achieve this aim. 

As explained above, QRNGs do not operate in ideal conditions. We develop a model for 
a real-world QRNG in which the bias, rather than holding steady, drifts slowly (within some 
bounds). In this framework we evaluate the speed of drift required to be maintained by the 
source distribution to guarantee that the output distribution is as close as one wishes to the 
uniform distribution. 

We have also examined the effect von Neumann normalisation has on various properties of infinite sequences. In particular, Borel normality and (algorithmic) randomness are invariant under normalisation, but for $\varepsilon$-random sequences with $0 < \varepsilon < 1$, normalisation can either decrease or increase the randomness of the source.

2 Notation 

We present the main notation used throughout the paper. 

By $2^X$ we denote the power set of $X$. By $|X|$ we denote the cardinality of the set $X$.

Let $B = \{0,1\}$ and denote by $B^*$ the set of all bit-strings ($\lambda$ is the empty string). If $x \in B^*$ and $i \in B$ then $|x|$ is the length of $x$ and $\#_i(x)$ represents the number of $i$'s in $x$. By $B^n$ we denote the finite set $\{x \in B^* \mid |x| = n\}$. The concatenation product of two subsets $X, Y$ of $B^*$ is defined by $XY = \{xy \mid x \in X, y \in Y\}$. If $X = \{x\}$ then we write $xY$ instead of $\{x\}Y$. By $B^\omega$ we denote the set of all infinite binary sequences. For $\mathbf{x} \in B^\omega$ and natural $n$ we denote by $\mathbf{x}(n)$ the prefix of $\mathbf{x}$ of length $n$. We write $w \sqsubset v$ or $w \sqsubset \mathbf{x}$ in case $w$ is a prefix of the string $v$ or the sequence $\mathbf{x}$.

^1 Many improvements of the scheme have been proposed [14, 24].
A prefix-free (Turing) machine is a Turing machine whose domain is a prefix-free set of strings. The prefix complexity of a string $\sigma$, $H_W(\sigma)$, induced by a prefix-free machine $W$ is $H_W(\sigma) = \min\{|p| : W(p) = \sigma\}$. Fix a computable $\varepsilon$ with $0 < \varepsilon \le 1$. An $\varepsilon$-universal prefix-free machine $U$ is a machine such that for every machine $W$ there is a constant $c$ (depending on $U$ and $W$) such that $\varepsilon \cdot H_U(\sigma) \le H_W(\sigma) + c$, for all $\sigma \in B^*$. If $\varepsilon = 1$ then $U$ is simply called a universal prefix-free machine. A sequence $\mathbf{x} \in B^\omega$ is called $\varepsilon$-random if there exists a constant $c$ such that $H_U(\mathbf{x}(n)) \ge \varepsilon \cdot n - c$, for all $n \ge 1$. Sequences that are 1-random are simply called random.

A sequence $\mathbf{x}$ is called Borel $m$-normal ($m \ge 1$) if for every $1 \le i \le 2^m$ one has: $\lim_{n\to\infty} N_i^m(\mathbf{x}(n)) / \lfloor n/m \rfloor = 2^{-m}$; here $N_i^m(y)$ counts the number of non-overlapping occurrences of the $i$th (in lexicographical order) binary string of length $m$ in the string $y$. The sequence $\mathbf{x}$ is called Borel normal if it is Borel $m$-normal, for every natural $m \ge 1$.

A probability space is a measure space such that the measure of the whole space is equal to one [5]. More precisely, a (Kolmogorov) probability space is a triple consisting of a sample space $\Omega$, a $\sigma$-algebra $\mathcal{F}$ on $\Omega$, and a probability measure $P$, i.e. a countably additive function defined on $\mathcal{F}$ with values in $[0,1]$ such that $P(\Omega) = 1$.

3 The finite case 

3.1 Source probability space and independence 

In this section we define the QRNG source probability space and the independence property. 

Consider a string of $n$ independent bits produced by a (biased) QRNG. Let $p_0, p_1$ be the probabilities that a bit is 0 or 1, respectively, with $p_0 + p_1 = 1$ and $p_0, p_1 < 1$.

The probability space of bit-strings produced by the QRNG is $(B^n, 2^{B^n}, P_n)$ where $P_n : 2^{B^n} \to [0,1]$ is defined by

$$P_n(X) = \sum_{x \in X} p_0^{\#_0(x)}\, p_1^{\#_1(x)}, \tag{1}$$

for all $X \subseteq B^n$.

It is easy to verify that the Kolmogorov axioms are satisfied for the space $(B^n, 2^{B^n}, P_n)$, so we have:

Fact 1. The space $(B^n, 2^{B^n}, P_n)$ with $P_n$ defined in (1) is a probability space.

The space $(B^n, 2^{B^n}, P_n)$ is just the $n$-fold product of the single-bit probability space $(B, 2^B, P_1)$. For this reason this space is often called an "independent identically-distributed bit source". The resulting space is "independent" because each bit is independent of previous ones. But what is "an independent probability space"?

Physically the independence of a QRNG is usually expressed as the impossibility of extracting any information from the flow of bits $x_1, \dots, x_{k-1}$ to improve chances of predicting the value of $x_k$, other than what one would have from knowing the probability space. The fact that photon-based QRNGs obey this physical independence between photons (and thus generated bits) rather well [2], [29] is the primary motivation for our modelling of these devices. These sources (where the condition of independence still holds) are often termed "independent-bit sources" [33]. In a real device we cannot, of course, expect each bit to be identically distributed, so we study this more general case more thoroughly in Section 3.5.

Formally, two events $A, B \subseteq B^n$ are independent (in a probability space) if the probability of their intersection coincides with the product of their probabilities (a complexity-theoretic approach was developed in [12]). This motivates the definition of independence of a general source probability space given in Definition 3. But first we need the following simple property:

Fact 2. For every bit-string $x$ and non-negative integers $n, k$ such that $0 \le k + |x| \le n$ we have:

$$P_n\left(B^k x B^{n-k-|x|}\right) = p_0^{\#_0(x)}\, p_1^{\#_1(x)} = P_{|x|}(\{x\}). \tag{2}$$
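Fact 2 can be checked by brute-force enumeration for small parameters. The following sketch uses exact rational arithmetic; the helper names `prob_string` and `prob_cylinder` are illustrative choices made here and do not come from the paper.

```python
from fractions import Fraction
from itertools import product


def prob_string(x: str, p0: Fraction) -> Fraction:
    """Probability p0^{#0(x)} * p1^{#1(x)} of the single string x."""
    p1 = 1 - p0
    return p0 ** x.count("0") * p1 ** x.count("1")


def prob_cylinder(n: int, k: int, x: str, p0: Fraction) -> Fraction:
    """P_n(B^k x B^{n-k-|x|}), summed over all strings of length n."""
    total = Fraction(0)
    for bits in product("01", repeat=n):
        s = "".join(bits)
        # Keep only strings that carry x at positions k .. k+|x|-1.
        if s[k:k + len(x)] == x:
            total += prob_string(s, p0)
    return total
```

With `p0 = Fraction(1, 3)`, the cylinder probability `prob_cylinder(5, 1, "01", p0)` coincides with `prob_string("01", p0)`, as (2) states: the free positions sum out to 1.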

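Definition 3. The probability space $(B^n, 2^{B^n}, \mathrm{Prob}_n)$ is independent if for all $1 \le k \le n$ and all $x_1 \dots x_k \in B^k$ the events $x_1 x_2 \cdots x_{k-1} B^{n-k+1}$ and $B^{k-1} x_k B^{n-k}$ are independent, i.e.

$$\mathrm{Prob}_n\left(x_1 x_2 \dots x_{k-1} x_k B^{n-k}\right) = \mathrm{Prob}_n\left(x_1 x_2 \dots x_{k-1} B^{n-k+1}\right) \cdot \mathrm{Prob}_n\left(B^{k-1} x_k B^{n-k}\right).$$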



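Fact 4. The probability space $(B^n, 2^{B^n}, P_n)$ with $P_n$ defined in (1) is independent.

Proof. Using (2) we have:

$$\begin{aligned}
P_n\left(x_1 x_2 \dots x_{k-1} x_k B^{n-k}\right) &= p_0^{\#_0(x_1 \dots x_k)}\, p_1^{\#_1(x_1 \dots x_k)} \\
&= p_0^{\#_0(x_1 \dots x_{k-1})}\, p_1^{\#_1(x_1 \dots x_{k-1})}\, p_0^{\#_0(x_k)}\, p_1^{\#_1(x_k)} \\
&= P_n\left(x_1 x_2 \dots x_{k-1} B^{n-k+1}\right) \cdot P_n\left(B^{k-1} x_k B^{n-k}\right).
\end{aligned}$$

□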
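For small $n$, Definition 3 can be tested mechanically. The sketch below is an illustration under stated assumptions: a probability space is represented as a dictionary from bit-strings to exact rationals, and the names `is_independent` and `iid_space` are hypothetical helpers introduced here.

```python
from fractions import Fraction
from itertools import product


def is_independent(prob: dict, n: int) -> bool:
    """Check the condition of Definition 3 for every k and every prefix x_1...x_k."""
    strings = ["".join(s) for s in product("01", repeat=n)]

    def measure(pred) -> Fraction:
        return sum((prob.get(s, Fraction(0)) for s in strings if pred(s)), Fraction(0))

    for k in range(1, n + 1):
        for prefix in ("".join(p) for p in product("01", repeat=k)):
            joint = measure(lambda s: s[:k] == prefix)            # x_1...x_k B^{n-k}
            left = measure(lambda s: s[:k - 1] == prefix[:k - 1])  # x_1...x_{k-1} B^{n-k+1}
            right = measure(lambda s: s[k - 1] == prefix[k - 1])   # B^{k-1} x_k B^{n-k}
            if joint != left * right:
                return False
    return True


def iid_space(n: int, p0: Fraction) -> dict:
    """The space of equation (1): each string x has mass p0^{#0(x)} p1^{#1(x)}."""
    p1 = 1 - p0
    return {"".join(s): p0 ** s.count("0") * p1 ** s.count("1")
            for s in product("01", repeat=n)}
```

Applied to `iid_space(3, Fraction(1, 3))` the check succeeds, in line with Fact 4, whereas a fully correlated space such as `{"00": 1/2, "11": 1/2}` fails it.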



As we will see later, there are other relevant independent probability spaces. 

3.2 Von Neumann normalisation function 

Here we present formally the von Neumann normalisation procedure. 
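We define the mapping $F : B^2 \to B \cup \{\lambda\}$ as

$$F(x_1 x_2) = \begin{cases} \lambda & \text{if } x_1 = x_2, \\ x_1 & \text{if } x_1 \ne x_2, \end{cases}$$

and $f : B \to B^2$ as

$$f(x) = x\bar{x},$$

where $\bar{x} = 1 - x$. Note that for all $x \in B$ we have $F(f(x)) = x$ and, for all $x_1, x_2 \in B$ with $x_1 \ne x_2$, $f(F(x_1 x_2)) = x_1 x_2$.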

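For $m \le \lfloor n/2 \rfloor$ we define the normalisation function $VN_{n,m} : B^n \to \left(\bigcup_{k \le m} B^k\right) \cup \{\lambda\}$ as

$$VN_{n,m}(x_1 \dots x_n) = F(x_1 x_2) F(x_3 x_4) \cdots F\left(x_{2\lfloor n/2 \rfloor - 1}\, x_{2\lfloor n/2 \rfloor}\right).$$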

Fact 5. For all $1 \le m \le \lfloor n/2 \rfloor$ and $y \in B^m$ there exists an $x \in B^n$ such that $y = VN_{n,m}(x)$.

Proof. Take $x = f(y_1) f(y_2) \cdots f(y_m) 0^{n-2m}$. □
In fact we can define the "inverse" normalisation $VN^{-1}_{n,m} : 2^{B^m} \to 2^{B^n}$ as

$$VN^{-1}_{n,m}(Y) = \Big\{ u_1 f(y_1) u_2 f(y_2) \cdots u_m f(y_m) u_{m+1} v \;\Big|\; y = y_1 \cdots y_m \in Y,\ u_i \in \{00, 11\}^*,\ v \in B \cup \{\lambda\},\ |v| + 2m + \sum_{i=1}^{m+1} |u_i| = n \Big\}.$$

While this isn't a "true" inverse, for every $y \in B^m$ we have: $VN_{n,m}\left(VN^{-1}_{n,m}(\{y\})\right) = \{y\}$.
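The maps $F$, $f$ and the normalisation function, together with the preimage set, can be written down directly for small $n$. This is a minimal sketch: the names `f`, `vn`, `inverse` and `witness` are illustrative, and the preimage is computed by exhaustive search rather than from the closed-form description above.

```python
from itertools import product


def f(bit: str) -> str:
    """f(x) = x followed by its complement: f(0) = 01, f(1) = 10."""
    return bit + ("1" if bit == "0" else "0")


def vn(x: str) -> str:
    """Apply F to consecutive non-overlapping pairs; an odd trailing bit is dropped."""
    return "".join(x[i] for i in range(0, len(x) - 1, 2) if x[i] != x[i + 1])


def inverse(y: str, n: int) -> set:
    """All strings of length n that normalise to y (exhaustive search)."""
    return {"".join(s) for s in product("01", repeat=n) if vn("".join(s)) == y}


def witness(y: str, n: int) -> str:
    """The string f(y_1)...f(y_m) 0^{n-2m} used in the proof of Fact 5."""
    return "".join(f(b) for b in y) + "0" * (n - 2 * len(y))
```

For example, `witness("101", 9)` is `100110000`, which normalises back to `101`, and every element of `inverse(y, n)` normalises to `y` by construction.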



3.3 Target probability space and normalisation

We now construct the target probability space of the normalised bit-strings over $B^m$ for $m \le \lfloor n/2 \rfloor$, i.e. the probability space of the output bit-strings produced by the application of the von Neumann function on the output bit-strings generated by the QRNG.

The von Neumann normalisation function $VN_{n,m}$ transforms the source probability space $(B^n, 2^{B^n}, P_n)$ into the target probability space $(B^m, 2^{B^m}, P_{n \to m})$. The target space of normalised bit-strings of length $1 \le m \le \lfloor n/2 \rfloor$ associated to the source probability space $(B^n, 2^{B^n}, P_n)$ is the space $(B^m, 2^{B^m}, P_{n \to m})$, where $P_{n \to m} : 2^{B^m} \to [0,1]$ is defined for all $Y \subseteq B^m$ by the formula:

$$P_{n \to m}(Y) = \frac{P_n\left(VN^{-1}_{n,m}(Y)\right)}{P_n\left(VN^{-1}_{n,m}(B^m)\right)}.$$

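Proposition 6. The target space $(B^m, 2^{B^m}, P_{n \to m})$ of normalised bit-strings of length $1 \le m \le \lfloor n/2 \rfloor$ associated to the source probability space $(B^n, 2^{B^n}, P_n)$ is a probability space.

Proof. We need to check only additivity: for $X, Y \subseteq B^m$ with $X \cap Y = \emptyset$ we have $P_{n \to m}(X \cup Y) = P_{n \to m}(X) + P_{n \to m}(Y)$. This equality is valid since $VN^{-1}_{n,m}(X \cup Y) = VN^{-1}_{n,m}(X) \cup VN^{-1}_{n,m}(Y)$ and $P_n\left(VN^{-1}_{n,m}(Y) \cup VN^{-1}_{n,m}(X)\right) = P_n\left(VN^{-1}_{n,m}(Y)\right) + P_n\left(VN^{-1}_{n,m}(X)\right)$, as $VN^{-1}_{n,m}(X) \cap VN^{-1}_{n,m}(Y) = \emptyset$ because $X$ and $Y$ are disjoint. □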

3.4 Normalisation of the output of a source with constant bias 

We now show that the von Neumann procedure transforms the source probability space with constant bias into the probability space with the uniform distribution over $B^m$, i.e. the target probability space $(B^m, 2^{B^m}, P_{n \to m})$ has $P_{n \to m} = U_m$, the uniform distribution. Independence and the constant bias of $P_n$ play a crucial role.

Theorem 7 (von Neumann). Assume that $1 \le m \le \lfloor n/2 \rfloor$. In the target probability space $(B^m, 2^{B^m}, P_{n \to m})$ associated to the source probability space $(B^n, 2^{B^n}, P_n)$ we have $P_{n \to m}(Y) = U_m(Y) = |Y| \cdot 2^{-m}$, for every $Y \subseteq B^m$.

Proof. Since $P_{n \to m}$ is additive it suffices to show that for any $y \in B^m$, $P_{n \to m}(\{y\}) = 2^{-m}$. Let $Z = P_n\left(VN^{-1}_{n,m}(B^m)\right)$. We have (the sums are over all $u_i \in \{00, 11\}^*$, $v \in B \cup \{\lambda\}$ such that $|v| + \sum_{i=1}^{m+1} |u_i| = n - 2m$):

$$\begin{aligned}
P_{n \to m}(\{y\}) &= \frac{1}{Z} \sum_{u_i, v} p_0^{\#_0(u_1 f(y_1) \dots u_m f(y_m) u_{m+1} v)}\, p_1^{\#_1(u_1 f(y_1) \dots u_m f(y_m) u_{m+1} v)} \\
&= \frac{p_0^{\#_0(f(y_1) \dots f(y_m))}\, p_1^{\#_1(f(y_1) \dots f(y_m))}}{Z} \sum_{u_i, v} p_0^{\#_0(u_1 \dots u_{m+1} v)}\, p_1^{\#_1(u_1 \dots u_{m+1} v)} \\
&= \frac{p_0^m\, p_1^m}{Z} \sum_{u_i, v} p_0^{\#_0(u_1 \dots u_{m+1} v)}\, p_1^{\#_1(u_1 \dots u_{m+1} v)},
\end{aligned}$$

which is independent of $y$. Since $P_{n \to m}(B^m) = 1$ and for all $x_1, x_2 \in B^m$ we have $P_{n \to m}(\{x_1\}) = P_{n \to m}(\{x_2\})$ it follows that $P_{n \to m}(\{y\}) = 2^{-m} = U_m(\{y\})$; by additivity, for every $Y \subseteq B^m$ we have $P_{n \to m}(Y) = U_m(Y) = |Y| \cdot 2^{-m}$. □
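Theorem 7 can be confirmed exactly for small $n$ by enumerating the source space. The sketch below is illustrative only: `target_distribution` is a name chosen here, and the target measure is obtained by conditioning on outputs of length exactly $m$, as in the definition of $P_{n \to m}$.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def vn(x: str) -> str:
    """Von Neumann normalisation of a bit-string (pairs 01 -> 0, 10 -> 1)."""
    return "".join(x[i] for i in range(0, len(x) - 1, 2) if x[i] != x[i + 1])


def target_distribution(n: int, m: int, p0: Fraction) -> dict:
    """P_{n->m}({y}) for all y in B^m, for a constantly biased independent source."""
    p1 = 1 - p0
    mass = defaultdict(Fraction)  # Fraction() is 0
    for bits in product("01", repeat=n):
        s = "".join(bits)
        y = vn(s)
        if len(y) == m:
            mass[y] += p0 ** s.count("0") * p1 ** s.count("1")
    z = sum(mass.values())  # this is Z = P_n(VN^{-1}(B^m))
    return {y: w / z for y, w in mass.items()}
```

With a strongly biased source such as `p0 = Fraction(1, 5)`, `n = 6` and `m = 2`, every one of the four output strings receives probability exactly 1/4.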

It is natural to check whether the independence and constant bias of the source probability 
space are essential for the validity of the von Neumann normalisation procedure. 

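Example 8. The source probability space $(B^2, 2^{B^2}, \mathrm{Prob}_2)$ where $\mathrm{Prob}_2(00) = 0$, $\mathrm{Prob}_2(01) = \mathrm{Prob}_2(10) = \mathrm{Prob}_2(11) = 1/3$ is not independent in the sense of Definition 3 (for instance $\mathrm{Prob}_2(00) = 0$ while $\mathrm{Prob}_2(0B) \cdot \mathrm{Prob}_2(B0) = 1/9$), and $\mathrm{Prob}_{2 \to 1} = U_1$.

Example 9. The source probability space $(B^2, 2^{B^2}, \mathrm{Prob}_2)$ where $\mathrm{Prob}_2(00) = \mathrm{Prob}_2(11) = 0$, $\mathrm{Prob}_2(01) = 1/3$, $\mathrm{Prob}_2(10) = 2/3$ is not independent either ($\mathrm{Prob}_2(01) = 1/3$ while $\mathrm{Prob}_2(0B) \cdot \mathrm{Prob}_2(B1) = 1/9$), and $\mathrm{Prob}_{2 \to 1} \ne U_1$.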

Comment. One could present the above examples in the more general framework of Theo- 
rem [71 

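Theorem 10. Let $m \ge 1$ and $n = 2m$. Consider the source probability space $(B^n, 2^{B^n}, \mathrm{Prob}_n) = \prod_{i=1}^{m} (B^2, 2^{B^2}, P_2^i)$, where $P_2^i(01) = P_2^i(10)$, for all $1 \le i \le m$. Then, in the target probability space $(B^m, 2^{B^m}, \mathrm{Prob}_{n \to m})$, where $\mathrm{Prob}_n = \prod_{i=1}^{m} P_2^i$, we have $\mathrm{Prob}_{n \to m} = U_m$.

Proof. It is easy to check that for every $y = y_1 \dots y_m \in B^m$ we have $\mathrm{Prob}_{n \to m}(\{y_1 \dots y_m\}) = \prod_{i=1}^{m} P_2^i(y_i \bar{y}_i) / \mathrm{Prob}_n\left(VN^{-1}_{n,m}(B^m)\right)$, so $\mathrm{Prob}_{n \to m}(\{y_1 \dots y_m\})$ does not depend on $y$ (because $P_2^i(a\bar{a}) = P_2^i(\bar{a}a)$ for every $a \in B$). Hence, $\mathrm{Prob}_{n \to m} = U_m$. □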



The source probability space $(B^n, 2^{B^n}, \mathrm{Prob}_n)$ in Theorem 10 is not constantly biased and may be independent or not, but von Neumann normalisation still produces the uniform distribution under these conditions.



Example 11. The source probability space (S^2^ ,Prob4) as in Theorem 10 where P2 (00) 
P2HOI) = 1/3,P2'(10) = 1/4,P2'(11) = 1/12 and P|(00) = 1/12,P|(01) = 1/4,P|(10) 
^2^(11) = 1/3 is not independent and Prob4_5>2 = U2. 
The outcomes of successive context preparations and measurements, such as is the case for the type of QRNG usually envisioned, are postulated to be independent of previous and future outcomes [17]. This means there must be no causal link between one measurement and the next within the system (preparation and measurement devices included) so that the system has no memory of previous or future events. For QRNGs this translates into the condition that the probability that each successive bit is either 0 or 1 is independent of the previous bit measured. We will only consider such independent probability spaces, as this is a necessary property of a good RNG, so most QRNGs are designed to conform to this requirement.

The above assumption needs to be made clear as in high bit-rate experimental configura- 
tions to generate QRNs with, e.g., photons, its validity may not always be clear. If the wave- 
functions of successive photons "overlap" the assumption no longer holds and (anti)bunching 
phenomena may play a role. This is an issue that needs to be more seriously considered in 
QRNG design and will only become more relevant as the bit-rate of QRNGs is pushed higher 
and higher. While we leave study of the nature of these temporal correlations (and any non- 
independence they may cause) to future research [2], we pose the following open question 
which may help to quantify any possible effect they may have. 

Open Question. Fix an integer $k > 0$ and a small positive real $\kappa$. Consider the probability space $(B^n, 2^{B^n}, P_n^k)$ where $P_n^k$ is a modification of the probability $P_n$ satisfying the conditions that for all $i \le n$ and $x_i \in B$ we have $P_n(B^{i-1} x_i B^{n-i}) = P_n^k(B^{i-1} x_i B^{n-i})$,

$$\left| P_n^k\left(B^{i-1} x_i B^{n-i}\right) - P_n^k\left(B^{i-1} x_i B^{n-i} \mid B^{i-k-1} x_{i-k} \cdots x_{i-1} B^{n-i+1}\right) \right| \le \kappa,$$

and for all $l > k$

$$P_n^k\left(B^{i-1} x_i B^{n-i} \mid B^{i-l-1} x_{i-l} \dots x_{i-1} B^{n-i+1}\right) = P_n^k\left(B^{i-1} x_i B^{n-i} \mid B^{i-k-1} x_{i-k} \dots x_{i-1} B^{n-i+1}\right).$$

In other words, the probability of each bit depends on no more than the previous $k$ bits, and the difference in probabilities for a bit between that given by $P_n^k$ conditioned on the previous $k$ bits and $P_n$ is no more than $\kappa$. If the output of such a source is normalised with the von Neumann procedure, how close is the resulting probability space of strings of length $m$ to the uniform distribution (see Definition 17 for a definition of the closeness of probability spaces)?



3.5 Normalisation of the output of a source with non-constant bias 

Now we consider the probability distribution obtained if von Neumann normalisation is applied 
to a string generated from an independent source with a non-constant bias — an "independent- 
bit source" . We consider only a bias which varies smoothly; this excludes the effects of sudden 
noise which could make the bias jump significantly from one bit to the next. Such a source 
corresponds to a QRNG in which the bias varies slowly (drifts) from bit to bit over time, but 
never too far from its average point. We choose this to model photon-based QRNGs since 
the primary cause of variation in the bias will be of this nature. For example, the detector 
efficiencies may vary as a result of slow changes in temperature or power supply. While 
abrupt changes — which this model does not account for — are plausible, their relatively rare 
occurrence (in comparison with the bit generation rate in the order of MHz) will mean they 
have little effect on the resultant distribution. 
Let $p_0, p_1 < 1$ and $p_0 + p_1 = 1$ be constant. Let $x = x_1 x_2 \cdots x_n \in B^n$ be the generated string. Then define the probability of an individual bit $x_i$ being either zero or one as

$$q_i^{x_i} = \begin{cases} p_0 - \varepsilon_i & \text{if } x_i = 0, \\ p_1 + \varepsilon_i & \text{if } x_i = 1. \end{cases}$$

The variation in the bias is bounded, so we require that for all $i$,

$$|\varepsilon_i| \le \beta, \quad \text{with } \beta < \min(p_0, p_1).$$

Let $\gamma_i = \varepsilon_{i+1} - \varepsilon_i$. Furthermore, we assume that the "speed" of variation is bounded, i.e. there exists a positive $\delta$ such that

$$|\gamma_i| \le \delta, \tag{4}$$

for all $i$. Evidently we have $\delta \le \beta$ (presumably in any real situation $\delta \ll \beta$); however, we introduce two separate constants since they correspond to two physically different (but related) concepts. Note that we will discuss in more detail the importance of these two parameters for the approximation of the uniform distribution and their relevance to calibration of the QRNG later once the analysis is completed. Indeed, the rate of change, $\gamma_i$, is more important; the need for $\beta$ stems from the need to realise that, even though the probabilities can fluctuate, they can only fluctuate in one direction for so long (since $q_i^{x_i} \in [0, 1]$), hence $\left|\sum_i \gamma_i\right| = |\varepsilon_n - \varepsilon_1| \le 2\beta$.

For a string $y = y_1 y_2 \dots y_k \in B^k$ and positive integer $i$ we introduce, for convenience, the following notation:

$$q_i(y) = q_i^{y_1}\, q_{i+1}^{y_2} \cdots q_{i+k-1}^{y_k}.$$

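The difference in probability between 01 and 10 depends only on $\gamma_i$, and this allows us to evaluate the effect of normalisation on such a string:

$$\begin{aligned}
q_i(01) - q_i(10) &= (p_0 - \varepsilon_i)(p_1 + \varepsilon_{i+1}) - (p_1 + \varepsilon_i)(p_0 - \varepsilon_{i+1}) \\
&= (p_0 + p_1)(\varepsilon_{i+1} - \varepsilon_i) \\
&= \gamma_i. \tag{5}
\end{aligned}$$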
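Identity (5) is easy to confirm with exact arithmetic. In the sketch below the function name `q_pair` and the sample values of $p_0$, $\varepsilon_i$ and $\varepsilon_{i+1}$ are arbitrary illustrative choices, not values from the paper.

```python
from fractions import Fraction


def q_pair(p0: Fraction, eps_i: Fraction, eps_next: Fraction, pair: str) -> Fraction:
    """q_i(pair) for a two-bit string, with per-bit probabilities p0 - eps and p1 + eps."""
    p1 = 1 - p0
    q = [{"0": p0 - e, "1": p1 + e} for e in (eps_i, eps_next)]
    return q[0][pair[0]] * q[1][pair[1]]


p0 = Fraction(2, 5)
eps_i, eps_next = Fraction(1, 100), Fraction(3, 100)
gamma_i = eps_next - eps_i
difference = q_pair(p0, eps_i, eps_next, "01") - q_pair(p0, eps_i, eps_next, "10")
```

Here `difference` equals `gamma_i` exactly, independently of the value chosen for `p0`.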

Let us first formally define the probability space generated by this QRNG. 

Proposition 12. The probability space of bit-strings produced by the QRNG is $(B^n, 2^{B^n}, R_n)$ where $R_n : 2^{B^n} \to [0, 1]$ is defined for all $X \subseteq B^n$ as follows:

$$R_n(X) = \sum_{x \in X} q_1(x). \tag{6}$$

Proof. We verify only that $R_n(B^n) = 1$, which is easily shown since $q_i^0 + q_i^1 = 1$, and $R_n(B^n) = \sum_{x \in B^n} q_1(x) = \prod_{i=1}^{n} (q_i^0 + q_i^1) = 1$. □

Fact 13. For all $i \ge 1$ and $x, y \in \{0, 1\}^*$ we have: $q_i(xy) = q_i(x)\, q_{i+|x|}(y)$.
Fact 14. For all k,n > 1, x e {0, 1}* with < k + \x\ < n we have: 

Rn = qn-k+i{x). (7) 
Proof. Using Fact 13 we get:

= X] (ll{v)(l\y\+l{x)q\y\+\^\+i{z) 
= qn-k+l{x) 1liy)%\+\^\+liz) 

= qn-k+l{x) Yl 91 (y) Y 

= qn-k+l{x)- 



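Fact 15. The probability space $(B^n, 2^{B^n}, R_n)$ with $R_n$ defined in (6) is independent.

Proof. Using (7) we have:

$$\begin{aligned}
R_n\left(x_1 x_2 \dots x_{k-1} x_k B^{n-k}\right) &= q_1(x_1 x_2 \dots x_{k-1} x_k) \\
&= q_1(x_1 x_2 \dots x_{k-1})\, q_k(x_k) \\
&= R_n\left(x_1 x_2 \dots x_{k-1} B^{n-k+1}\right) \cdot R_n\left(B^{k-1} x_k B^{n-k}\right).
\end{aligned}$$

□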



□ 

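As with the constantly biased source, we consider the probability space $R_{n \to m}$. We first investigate the simplest case $n = 2m$. In this situation, for any $y \in B^m$ we have $VN^{-1}_{n,m}(\{y\}) = \{f(y_1) f(y_2) \cdots f(y_m)\}$ and $VN^{-1}_{n,m}(B^m) = \{f(z_1) f(z_2) \cdots f(z_m) \mid z = z_1 \dots z_m \in B^m\}$.

Fact 16. The probability space of normalised bit-strings of length $m = n/2$ is $(B^m, 2^{B^m}, R_{n \to m})$ where $R_{n \to m} : 2^{B^m} \to [0, 1]$ is defined for all $Y \subseteq B^m$ as follows:

$$R_{n \to m}(Y) = \frac{R_n\left(VN^{-1}_{n,m}(Y)\right)}{R_n\left(VN^{-1}_{n,m}(B^m)\right)} = \sum_{y \in Y} \prod_{i=1}^{m} \frac{q_{2i-1}(f(y_i))}{q_{2i-1}(01) + q_{2i-1}(10)}.$$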



3.6 Approximating the uniform distribution

Unlike the case for a constantly biased source, we no longer have $q_i(01) = q_i(10)$; in fact by (5) we have $q_i(01) = q_i(10) + \gamma_i$. As a result the normalised distribution is no longer the uniform distribution, but only an approximation thereof. We now explore how closely $R_{n \to m}$ approximates $U_m$.

We first need to define what we mean by approximating $U_m$.

Definition 17. The total variation distance between two probability measures $P$ and $Q$ over the space $\Omega$ is $\Delta(P, Q) = \max_{A \subseteq \Omega} |P(A) - Q(A)|$. We say that $P$ and $Q$ are $\rho$-close if $\Delta(P, Q) \le \rho$.
It is well known that

Lemma 18. For finite $\Omega$ we have $\Delta(P, Q) = \frac{1}{2} \sum_{x \in \Omega} |P(\{x\}) - Q(\{x\})|$.
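To get a feel for these quantities, the normalised distribution of a drifting source with $n = 2m$ and its distance from $U_m$ can be computed exactly for small $m$. This is a sketch under stated assumptions: the per-bit offsets $\varepsilon_i$ are passed as a list of exact rationals, the product form of Fact 16 is used for the normalised measure, and the names `normalised_distribution` and `total_variation` are introduced here for illustration.

```python
from fractions import Fraction
from itertools import product


def normalised_distribution(p0: Fraction, eps: list) -> dict:
    """R_{n->m}({y}) for n = 2m = len(eps), using the product form of Fact 16."""
    p1 = 1 - p0
    m = len(eps) // 2
    dist = {}
    for bits in product("01", repeat=m):
        weight = Fraction(1)
        for i, bit in enumerate(bits):
            e_a, e_b = eps[2 * i], eps[2 * i + 1]
            q01 = (p0 - e_a) * (p1 + e_b)  # pair 01 at this position
            q10 = (p1 + e_a) * (p0 - e_b)  # pair 10 at this position
            weight *= (q01 if bit == "0" else q10) / (q01 + q10)
        dist["".join(bits)] = weight
    return dist


def total_variation(dist: dict) -> Fraction:
    """Distance to the uniform distribution, computed as in Lemma 18."""
    m = len(next(iter(dist)))
    uniform = Fraction(1, 2 ** m)
    return sum(abs(w - uniform) for w in dist.values()) / 2
```

With a constant offset the distance is exactly 0, while a slowly drifting offset (for example 0, 1/100, 2/100, 3/100) gives a small but strictly positive distance.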

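The variation $\Delta(R_{n \to m}, U_m)$ depends on each $\gamma_i$ and $q_i$ (thus on $p_0$, $p_1$ and each $\varepsilon_i$) but we wish to calculate the worst case in terms of the bounds $\delta$, $\beta$ and $p_0, p_1$, i.e. using Lemma 18

$$\max_{\gamma_i, \varepsilon_i} \Delta(R_{n \to m}, U_m) = \frac{1}{2} \max_{\gamma_i, \varepsilon_i} \sum_{y \in B^m} \left| R_{n \to m}(\{y\}) - 2^{-m} \right|.$$

Let us first note that we can write

$$\frac{q_{2i-1}(f(y_i))}{q_{2i-1}(01) + q_{2i-1}(10)} = \frac{q_{2i-1}(f(y_i))}{2 q_{2i-1}(f(y_i)) - (-1)^{y_i} \gamma_{2i-1}} = \frac{1}{2} \left( 1 + \frac{(-1)^{y_i} \gamma_{2i-1}}{2 q_{2i-1}(f(y_i)) - (-1)^{y_i} \gamma_{2i-1}} \right),$$

and hence we have

$$R_{n \to m}(\{y\}) = 2^{-m} \prod_{i=1}^{m} \left( 1 + \frac{(-1)^{y_i} \gamma_{2i-1}}{2 q_{2i-1}(f(y_i)) - (-1)^{y_i} \gamma_{2i-1}} \right) = 2^{-m} \prod_{i=1}^{m} \left( 1 + \frac{(-1)^{y_i} \gamma_{2i-1}}{q_{2i-1}(01) + q_{2i-1}(10)} \right).$$

We have rewritten the denominator in its original form to emphasise that only the signs $(-1)^{y_i}$ depend on $y$. Thus, we want to find the values of $q_{2i-1}$ and $\gamma_{2i-1}$ which maximise

$$\sum_{y \in B^m} \left| 1 - \prod_{i=1}^{m} \left( 1 + \frac{(-1)^{y_i} \gamma_{2i-1}}{q_{2i-1}(01) + q_{2i-1}(10)} \right) \right| \tag{9}$$

subject to the constraints that $|\gamma_i| \le \delta$ and $|\varepsilon_i| \le \beta$ for $1 \le i \le n$.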
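Lemma 19. The function

$$g(c_1, \dots, c_n) = \sum_{y \in B^n} \left| \prod_{i=1}^{n} \left( 1 + (-1)^{y_i} c_i \right) - 1 \right|$$

is strictly increasing for $0 \le c_i < 1$, $i = 1, \dots, n$ (note that for $1 \le i \le n$, $g(c_1, \dots, c_i, \dots, c_n) = g(c_1, \dots, -c_i, \dots, c_n)$).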

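Proof. We take $0 \le c_i < 1$ for $1 \le i \le n$. For $y = y_1 \cdots y_n \in B^n$ define $p(y, j) = \prod_{i=1, i \ne j}^{n} (1 + (-1)^{y_i} c_i)$. Without loss of generality pick a $j \le n$ and let $\varepsilon > 0$ be an (arbitrarily small) positive real with $c_j + \varepsilon < 1$. Note that

$$g(c_1, \dots, c_n) = \sum_{y \in B^n} \left| (1 + (-1)^{y_j} c_j)\, p(y, j) - 1 \right|.$$

We partition $B^n$ as follows:

$$\begin{aligned}
Y_1 &= \{ y \mid (1 - c_j - \varepsilon)\, p(y, j) - 1 \ge 0 \}, \\
Y_2 &= \{ y \mid (1 - c_j - \varepsilon)\, p(y, j) - 1 < 0 \text{ and } (1 - c_j)\, p(y, j) - 1 \ge 0 \}, \\
Y_3 &= \{ y \mid (1 - c_j)\, p(y, j) - 1 < 0 \text{ and } (1 + c_j)\, p(y, j) - 1 \ge 0 \}, \\
Y_4 &= \{ y \mid (1 + c_j)\, p(y, j) - 1 < 0 \text{ and } (1 + c_j + \varepsilon)\, p(y, j) - 1 \ge 0 \}, \\
Y_5 &= \{ y \mid (1 + c_j + \varepsilon)\, p(y, j) - 1 < 0 \}.
\end{aligned}$$

Note that for $y \in B^n$, $p(y, j) > 0$, and for $y_i \in Y_i$, $i = 1, \dots, 5$, we have

$$p(y_5, j) < p(y_4, j) < p(y_3, j) < p(y_2, j) < p(y_1, j),$$

and $\bigcup_{i=1}^{5} Y_i = B^n$. Since $p(y, j)$ does not depend on $y_j$, each $Y_i$ is closed under flipping the bit $y_j$. We have:

$$\begin{aligned}
g(c_1, \dots, c_j + \varepsilon, \dots, c_n) &= \sum_{i=1}^{5} \sum_{y \in Y_i} \left| (1 + (-1)^{y_j} (c_j + \varepsilon))\, p(y, j) - 1 \right| \\
&= \sum_{y \in Y_1} \left[ (1 + (-1)^{y_j} c_j)\, p(y, j) - 1 + (-1)^{y_j} \varepsilon\, p(y, j) \right] \\
&\quad + \sum_{i=2}^{4} \sum_{y \in Y_i} (-1)^{y_j} \left[ (1 + (-1)^{y_j} c_j)\, p(y, j) - 1 + (-1)^{y_j} \varepsilon\, p(y, j) \right] \\
&\quad - \sum_{y \in Y_5} \left[ (1 + (-1)^{y_j} c_j)\, p(y, j) - 1 + (-1)^{y_j} \varepsilon\, p(y, j) \right] \\
&= \sum_{i=1}^{5} \sum_{y \in Y_i} \left| (1 + (-1)^{y_j} c_j)\, p(y, j) - 1 \right| + \varepsilon \sum_{i=2}^{4} \sum_{y \in Y_i} p(y, j) \\
&\quad - 2 \sum_{y \in Y_2,\, y_j = 1} \left[ (1 - c_j)\, p(y, j) - 1 \right] + 2 \sum_{y \in Y_4,\, y_j = 0} \left[ (1 + c_j)\, p(y, j) - 1 \right] \\
&= g(c_1, \dots, c_j, \dots, c_n) + \varepsilon \sum_{y \in Y_3} p(y, j) \\
&\quad - 2 \sum_{y \in Y_2,\, y_j = 1} \left[ (1 - c_j - \varepsilon)\, p(y, j) - 1 \right] + 2 \sum_{y \in Y_4,\, y_j = 0} \left[ (1 + c_j + \varepsilon)\, p(y, j) - 1 \right] \\
&> g(c_1, \dots, c_j, \dots, c_n),
\end{aligned}$$

where the final line follows from the definition of $Y_2$ and $Y_4$. Since this holds for all $j \le n$, $g$ is strictly increasing over $[0, 1)^n$. □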

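Hence in order to maximise (9) we need to maximise the functions

$$u_j(\varepsilon_j, \gamma_j) = \frac{|\gamma_j|}{q_j(01) + q_j(10)} = \frac{|\gamma_j|}{(p_0 - \varepsilon_j)(p_1 + \varepsilon_j + \gamma_j) + (p_1 + \varepsilon_j)(p_0 - \varepsilon_j - \gamma_j)} \tag{10}$$

for $j = 2i - 1$, $1 \le i \le m$, subject to the constraints $|\gamma_j| \le \delta$, $|\varepsilon_j| \le \beta$ and $|\varepsilon_{j+1}| = |\varepsilon_j + \gamma_j| \le \beta$.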

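Lemma 20. For every $j \ge 1$ we have

$$\begin{aligned}
u_j(\varepsilon_j, \gamma_j) &\le \begin{cases} u_j(\beta, -\delta) = u_j(\beta - \delta, \delta) & \text{if } p_1 \ge p_0, \\ u_j(-\beta, \delta) = u_j(-\beta + \delta, -\delta) & \text{if } p_0 \ge p_1 \end{cases} && (11) \\
&= \frac{\delta}{2 \left[ p_0 p_1 - \beta(\beta - \delta) - |p_0 - p_1| (\beta - \delta/2) \right]}. && (12)
\end{aligned}$$

Proof. We omit the index $j$ as it is not needed in this context. Let

$$v(\varepsilon, \gamma) = \frac{\gamma}{(p_0 - \varepsilon)(p_1 + \varepsilon + \gamma) + (p_1 + \varepsilon)(p_0 - \varepsilon - \gamma)}.$$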
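Since $q(01) + q(10) > 0$, in order to maximise $u$ we look for maxima and minima of $v$; clearly maxima have $\gamma > 0$ and minima have $\gamma < 0$. We use Lagrange multipliers with inequality constraints to find the critical points. We have the following six constraints: $h_1(\varepsilon, \gamma) = \varepsilon - \beta \le 0$, $h_2(\varepsilon, \gamma) = -\varepsilon - \beta \le 0$, $h_3(\varepsilon, \gamma) = \varepsilon + \gamma - \beta \le 0$, $h_4(\varepsilon, \gamma) = -\varepsilon - \gamma - \beta \le 0$, $h_5(\varepsilon, \gamma) = \gamma - \delta \le 0$, $h_6(\varepsilon, \gamma) = -\gamma - \delta \le 0$. We must solve the following equations:

$$\begin{aligned}
\nabla_{\varepsilon, \gamma}\, v(\varepsilon, \gamma) + \sum_{i=1}^{6} \lambda_i \nabla_{\varepsilon, \gamma}\, h_i(\varepsilon, \gamma) &= 0, && (13) \\
\lambda_i h_i(\varepsilon, \gamma) &= 0 \quad \text{for } i = 1, \dots, 6, && (14) \\
h_i(\varepsilon, \gamma) &\le 0 \quad \text{for } i = 1, \dots, 6, && (15) \\
\lambda_i \ge 0 \text{ for minima}, \quad \lambda_i &\le 0 \text{ for maxima}, \quad i = 1, \dots, 6. && (16)
\end{aligned}$$

We say a constraint is inactive if $\lambda_i = 0$ and active otherwise; the condition of complementarity (14) captures the notion that a critical point satisfying the constraints either occurs at $h_i(\varepsilon, \gamma) = 0$ or is also a critical point in the unconstrained problem.

Noting that $0 < p_0 - \beta < p_0 + \beta < 1$ and solving, we find the candidate points are:

$$(\varepsilon, \gamma) \in \begin{cases} \left( \tfrac{1}{2}(p_0 - p_1) \pm \tfrac{1}{2}\delta,\ \mp\delta \right), \\ (\beta, -\delta),\ (\beta - \delta, \delta) & \text{for } p_0 - p_1 \le 2\beta - \delta, \\ (-\beta, \delta),\ (-\beta + \delta, -\delta) & \text{for } p_1 - p_0 \le 2\beta - \delta. \end{cases}$$

Note that $u(\varepsilon, \gamma) = u(\varepsilon + \gamma, -\gamma)$. Testing values shows the second case maximises $u(\varepsilon, \gamma)$ when $p_1 \ge p_0$ and the third case maximises $u(\varepsilon, \gamma)$ for $p_0 \ge p_1$. For $p_0 = p_1$ both cases give the same value. Substituting in $\varepsilon, \gamma$ and consolidating the cases we arrive at (12). □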
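The closed form (12) can be compared numerically with the function in (10). The following sketch uses exact rationals and a coarse grid over the feasible region; the parameter values and the names `u` and `alpha_bound` are illustrative assumptions, and a grid check of this kind is only a sanity test, not a proof.

```python
from fractions import Fraction


def u(p0: Fraction, eps: Fraction, gamma: Fraction) -> Fraction:
    """u(eps, gamma) = |gamma| / (q(01) + q(10)), as in (10)."""
    p1 = 1 - p0
    denom = (p0 - eps) * (p1 + eps + gamma) + (p1 + eps) * (p0 - eps - gamma)
    return abs(gamma) / denom


def alpha_bound(p0: Fraction, beta: Fraction, delta: Fraction) -> Fraction:
    """Right-hand side of (12)."""
    p1 = 1 - p0
    return delta / (2 * (p0 * p1 - beta * (beta - delta) - abs(p0 - p1) * (beta - delta / 2)))


p0, beta, delta = Fraction(2, 5), Fraction(1, 10), Fraction(1, 50)
bound = alpha_bound(p0, beta, delta)

# Largest value of u on a grid of feasible (eps, gamma) pairs.
grid_max = max(
    u(p0, Fraction(e, 100), Fraction(g, 100))
    for e in range(-10, 11)
    for g in range(-2, 3)
    if abs(Fraction(e, 100) + Fraction(g, 100)) <= beta
)
```

For these sample values $p_1 \ge p_0$, so the maximum is attained at $(\beta, -\delta)$ and the grid maximum coincides with the bound.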



Next we let

$$\alpha = \max_j u_j(\varepsilon_j, \gamma_j),$$

where $u_j(\varepsilon_j, \gamma_j)$ comes from (10). Then we have

$$\max \Delta(R_{n \to m}, U_m) = \frac{1}{2} \sum_{k=0}^{m} \binom{m}{k} \left| \left( \frac{1}{2} + \frac{\alpha}{2} \right)^{k} \left( \frac{1}{2} - \frac{\alpha}{2} \right)^{m-k} - 2^{-m} \right|.$$

Note that in this worst case, the normalised source acts as an independent and identically-distributed source with $p_0 = 1/2 \pm \alpha/2$ and the total variation is bounded by that of two binomial sources: one with $p_0 = 1/2$, the other with $p_0 = 1/2 \pm \alpha/2$ (the number $k$ of successful outcomes is identified with the number of ones in $y$).

There are two interesting questions: a) what is the quality of the distribution produced by a QRNG, i.e. how close are $R_{n \to m}$ and $U_m$ in terms of $\alpha$? and b) given a real $\rho \in (0, 1)$, how accurate does the QRNG need to be in terms of $\alpha$ to guarantee that $R_{n \to m}$ and $U_m$ are $\rho$-close?

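We can take a rough approach to solve the above problems as follows. First note that

$$\max \Delta(R_{n \to m}, U_m) = \frac{1}{2} \sum_{y \in B^m} 2^{-m} \left| \prod_{i=1}^{m} \left( 1 + (-1)^{y_i} \alpha \right) - 1 \right| \le \frac{1}{2} \sum_{y \in B^m} 2^{-m} \left( (1 + \alpha)^m - 1 \right) = \frac{1}{2} \left( (1 + \alpha)^m - 1 \right).$$

So given $\alpha$, $R_{n \to m}$ and $U_m$ are at most $\frac{1}{2}\left((1 + \alpha)^m - 1\right)$-close. Conversely, $R_{n \to m}$ and $U_m$ are $\rho$-close if

$$\alpha \le (1 + 2\rho)^{1/m} - 1. \tag{17}$$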

We will express further results in the latter form, focusing on question b), although both are 
important questions depending on the operational circumstances and results can easily be 
transformed from one form to the other. 
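The two directions of this rough bound are simple to evaluate. The sketch below uses floating-point arithmetic; the function names and the sample values of the closeness parameter and of $m$ are illustrative choices made here.

```python
def rough_variation_bound(alpha: float, m: int) -> float:
    """Rough upper bound ((1 + alpha)^m - 1) / 2 on the total variation."""
    return 0.5 * ((1.0 + alpha) ** m - 1.0)


def rough_alpha_requirement(rho: float, m: int) -> float:
    """Largest alpha allowed by (17) for a target closeness rho."""
    return (1.0 + 2.0 * rho) ** (1.0 / m) - 1.0


m = 1000
alpha_needed = rough_alpha_requirement(0.01, m)    # roughly 2e-5
round_trip = rough_variation_bound(alpha_needed, m)  # recovers 0.01 up to rounding
vacuous = rough_variation_bound(0.01, m)           # far larger than 1
```

The last value shows the weakness of the rough bound: for moderately large $\alpha$ and $m$ it exceeds 1, although a total variation distance can never do so.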

So, by making $\alpha$ very small, $R_{n \to m}$ can be made as close as we wish to the uniform distribution. This is intuitive since $\alpha \to 0$ only as $\delta \to 0$ and we approach the constantly biased source situation.

There are, unfortunately, some issues with this bound. First, as $m \to \infty$ the bound on the variation becomes infinite too. This is unreasonable as by definition we should have $\Delta(R_{n \to m}, U_m) \le 1$. It only makes sense to talk about $\rho < 1$, although in any useful situation we will require $\rho$ to be small (close to 0) so it is only of real importance that the bound is good in this situation. However, (17) requires $\alpha$ to be significantly smaller than we really require for the two probabilities to be $\rho$-close. Even for small $\rho$ the bound is nowhere near tight enough (see Figure 2). Further, it would be instructive to examine more correctly the behaviour for large $m$ and investigate fully the nature of the relationship between $\alpha$, $m$ and $\rho$.

To rectify this and find a more reasonable bound, we carry out a finer analysis making use of the previous observation that this is the same problem as finding the variation between two binomial distributions. Let us denote a binomial probability distribution function for $n$ trials and probability of success $p$ as $S_{n,p} : 2^{\{0, \dots, n\}} \to [0, 1]$ where for each $A \subseteq \{0, \dots, n\}$,

$$S_{n,p}(A) = \sum_{k \in A} \binom{n}{k} p^k (1 - p)^{n-k}.$$

For $0 < p, p' < 1$, we then have

$$\Delta(S_{n,p}, S_{n,p'}) = \frac{1}{2} \sum_{k=0}^{n} \binom{n}{k} \left| p^k (1 - p)^{n-k} - p'^k (1 - p')^{n-k} \right|$$

and

$$\max \Delta(R_{n \to m}, U_m) = \Delta\left(S_{m, \frac{1}{2}(1 \pm \alpha)}, S_{m, \frac{1}{2}}\right).$$
Fact 21. For $0 < p, p' < 1$ we have $\Delta(S_{n,p}, S_{n,p'}) = \Delta(S_{n,1-p}, S_{n,1-p'})$.

The total variation between two binomial distributions can be given in terms of regularised incomplete beta functions [3].

Definition 22. The incomplete beta function is defined as

$$B_\xi(a, b) = \int_0^\xi u^{a-1} (1 - u)^{b-1}\, du.$$

For $\xi = 1$ we write $B_1(a, b) = B(a, b)$ for the complete beta function, or just beta function. The regularised incomplete beta function is defined as

$$I_\xi(a, b) = \frac{B_\xi(a, b)}{B(a, b)}.$$



Theorem 23. Let $0 < p < 1$, $q = 1 - p$ and $0 < x < q$. The total variation between two binomial distributions with probability of success $p$ and $p + x$ is

$$\begin{aligned}
\Delta(S_{n,p}, S_{n,p+x}) &= n \int_p^{p+x} S_{n-1,u}(\{\ell - 1\})\, du \\
&= I_{p+x}(\ell, n - \ell + 1) - I_p(\ell, n - \ell + 1),
\end{aligned}$$

where

$$\lceil np \rceil \le \ell := \ell(n, p, x) = \left\lceil \frac{-n \log(1 - x/q)}{\log(1 + x/p) - \log(1 - x/q)} \right\rceil \le \lceil n(p + x) \rceil.$$

Proof. The first line is from Adell and Jodrá [3]. The rest follows from the well known properties of the beta functions: $B(a, b) = B(b, a)$ and

$$\binom{n}{k} = \frac{1}{(n + 1)\, B(n - k + 1, k + 1)}.$$

□



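Theorem 24. The total variation is bounded by

$$\begin{aligned}
\Delta(R_{n \to m}, U_m) &\le I_{\frac{1}{2}(1+\alpha)}(\ell, m - \ell + 1) - I_{\frac{1}{2}}(\ell, m - \ell + 1) \\
&= F(m - \ell; m, 1/2 - \alpha/2) - F(m - \ell; m, 1/2),
\end{aligned}$$

where

$$\lceil m/2 \rceil \le \ell = \ell(m, 1/2, \alpha/2) = \left\lceil \frac{-m \log(1 - \alpha)}{\log(1 + \alpha) - \log(1 - \alpha)} \right\rceil \le \lceil m(1 + \alpha)/2 \rceil$$

and

$$F(k; n, p) = \sum_{x=0}^{k} S_{n,p}(\{x\})$$

is the cumulative distribution function for the binomial distribution.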
[Figure 1 plot: variation $\rho$ as a function of $\alpha$]

Figure 1: Plot of $\rho$ against $\alpha$ using the bound in Theorem 24 for four values of $m$: 100 (dotted), 1,000 (dashed), 10,000 (dot-dashed) and 1,000,000 (solid).



Proof. This follows directly from Theorem 23 and Fact 21. The last line follows from well known properties of the binomial distribution. □



This bound is exact (under the extrema given by Lemma 20), and we easily verify that $\Delta(R_{n \to m}, U_m) \le 1$ since $I_p(a, b) \le 1$ for all $a, b$ and $p \le 1$, and for $p' \ge p$ we have $I_{p'}(a, b) \ge I_p(a, b)$ (with equality only for $p = p'$). Unfortunately this bound on the variation has no simple closed form, so we cannot easily relate $\alpha$, $m$ and $\rho$ like we did in (17). The shape and nature of this relationship can be seen for various values of $m$ in Figure 1. In practice, with $m$ fixed and given $\rho$ it is easy to compute (with numerical methods) $\alpha$ such that $\Delta(R_{n \to m}, U_m) \le \rho$. For relatively small $\rho$ however, we can find a simple and fairly good bound which is easy to work with for rough approximations.
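The numerical computation mentioned above might look as follows. This is a sketch under stated assumptions: it uses plain floating-point binomial sums (adequate for moderate $m$ only), assumes Python 3.8 or later for `math.comb`, and finds the largest admissible $\alpha$ by bisection, which relies on the variation growing with $\alpha$. All function names are illustrative.

```python
import math


def binom_pmf(k: int, n: int, p: float) -> float:
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def binom_cdf(k: int, n: int, p: float) -> float:
    """F(k; n, p): cumulative distribution function of the binomial distribution."""
    return sum(binom_pmf(x, n, p) for x in range(0, k + 1))


def tv_direct(m: int, alpha: float) -> float:
    """Total variation between S_{m,(1+alpha)/2} and S_{m,1/2}, summed term by term."""
    p = 0.5 * (1.0 + alpha)
    return 0.5 * sum(abs(binom_pmf(k, m, p) - binom_pmf(k, m, 0.5)) for k in range(m + 1))


def tv_theorem24(m: int, alpha: float) -> float:
    """The same quantity via the cumulative-distribution form of Theorem 24 (alpha > 0)."""
    ell = math.ceil(-m * math.log(1.0 - alpha) / (math.log(1.0 + alpha) - math.log(1.0 - alpha)))
    return binom_cdf(m - ell, m, 0.5 - alpha / 2.0) - binom_cdf(m - ell, m, 0.5)


def max_alpha(m: int, rho: float, iterations: int = 60) -> float:
    """Bisection for the largest alpha whose worst-case variation stays within rho."""
    lo, hi = 0.0, 0.999
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if tv_direct(m, mid) <= rho:
            lo = mid
        else:
            hi = mid
    return lo
```

For small cases the two ways of computing the variation agree to machine precision, and `max_alpha(100, 0.05)` returns a strictly positive tolerance on $\alpha$.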



Theorem 25. Assume that $m = n/2$. Consider the probability spaces $(B^m, 2^{B^m}, R_{n \to m})$ and $(B^m, 2^{B^m}, U_m)$. For every real $\rho$ such that $0 < \rho < 1$, if



and 



a < p\ 



'27r(l 



_2_' 

m I 



m + 1 

then A{Rn^m, Um) < P- 

Proof. We will take a first order (linear) approximation of $\Delta(S_{m,1/2}, S_{m,\frac12(1+\alpha)})$ around
$\alpha = 0$. From Theorem 23 and the Fundamental Theorem of Calculus we have

\[
\Phi(\alpha) := \frac{\partial}{\partial\alpha}\,\Delta(S_{m,1/2}, S_{m,\frac12(1+\alpha)})
  = m \binom{m-1}{\ell-1} 2^{-m} (1+\alpha)^{\ell-1} (1-\alpha)^{m-\ell} .
\]

Since $\ell \ge \lceil m/2 \rceil$ we have

\[ \Phi(\alpha) \le \Phi(0), \]

so our first order upper bound is given by

\[
\Delta(S_{m,1/2}, S_{m,\frac12(1+\alpha)}) \le \alpha\,\Phi(0) = \alpha\, m \binom{m-1}{\ell-1} 2^{-m} .
\]

Since the central binomial coefficient (i.e. $\binom{m}{\lfloor m/2 \rfloor}$) is the largest, for $k \le m-1$ we have

\[
\binom{m-1}{k} \le \binom{m-1}{\lfloor (m-1)/2 \rfloor} = \binom{m-1}{\lceil m/2 \rceil - 1},
\]

which can easily be shown by taking the two cases of $m$ odd and $m$ even. Since $\ell \ge \lceil m/2 \rceil$
we have that

\[
\Phi(0) = 2^{-m} m \binom{m-1}{\ell-1} \le 2^{-m} m \binom{m-1}{\lceil m/2 \rceil - 1}
  = 2^{-m} \lceil m/2 \rceil \binom{m}{\lceil m/2 \rceil} .
\]

Using the bounds given in Corollary 2.3 of [28], and writing $m = a\lceil m/2 \rceil$ where $a \le 2$, we have

[chain of inequalities illegible in the source]

Hence, we have

[final bound illegible in the source]

□

This bound is much better than the bound given in (17), and for small $\alpha$ is extremely
good. It has the desired property that as $\alpha \to 0$, the bound on the variation tends to 0 also.
Obviously this bound is not less than one for all $\alpha$, but for small $\rho$ the bound is very good,
as can be seen in Figure 2.



Figure 2: Plot of upper bounds on the variation between $R_{n\to m}$ and $U_m$.



Another interesting question refers to the possibility of manipulating the parameter $\alpha$
for fine calibration of the QRNG. For $R_{n\to m}$ to become closer to $U_m$ we need to make $\alpha$
smaller, but this can be done by adjusting both $\delta$ and $\beta$. As previously discussed, both are
reasonable physical parameters, and which one is the most suitable (or easiest) to decrease
experimentally will to a large extent depend on the QRNG set-up itself. However, adjusting
$\delta$ has a larger effect on $\alpha$ than adjusting $\beta$ does, and $R_{n\to m}$ will only approach $U_m$ arbitrarily
closely as $\delta \to 0$, as even with $\beta = \delta$ (recall $\delta \le \beta$) we do not have $\alpha = 0$ unless $\delta = 0$.

These results can be extended to all $m \le n/2$, although the analysis is rather elaborate.
The key difference is that in the definition of $R_{n\to m}$ the set $VN_{n,m}^{-1}(Y)$ no longer has the
same size as $Y$, so an additional summation is needed in the right hand side of (8). However,
the total variation will still be maximised under the same conditions as in Lemmata 19 and 20
and the same relation as in Theorem 24 holds.



It is worth noting that the conditions which maximised the variation in (11) correspond
to every $\varepsilon_i$ being the same up to a small variation $\delta$. Physically this would indicate that $p_0$, $p_1$
have been incorrectly stated, but that the device is actually rather accurate except for a small
drift in probabilities of no more than $\delta$. Since the parameters $\varepsilon_i$ are supposed to physically
account for the amount the probability is allowed to drift, which will normally be much more
than the drift between individual bits (the $\gamma_j$), if the device is calibrated so that $p_0$ and $p_1$
are centred so that the $\varepsilon_i$ are distributed around them, then the variation will not be nearly
as bad as in this worst case. However, the bound on the variation remains valid as it is not
necessarily meaningful (or useful) to look into the physical situation under which the worst
case bound is achieved.

We briefly wish to point out that other methods for dealing with independent-bit sources 
have been proposed. For example, grouping bits into blocks of size i and taking the parity of 
these bits for the "normalised" bit, produces a string of length n/i [33]. With this method 
each bit becomes unbiased exponentially fast in i. However, the bound in Theorem 25 is 



asymptotically tighter than the corresponding bound that can be obtained by the parity 
method if the block size i is fixed; if i scales polynomially with n then this method produces a 
better bound, but at a substantial cost to the number of bits produced [Ml Proposition 6.5]. 
The reason the von Neumann normalisation outperforms the parity method is due to the fact 
that the bias is required to vary slowly. 
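
To make the exponential decay of the bias under the parity method concrete, the following
Python sketch uses the standard fact that the parity of $i$ independent bits, each equal to 1
with probability $p$, equals 1 with probability $(1 - (1-2p)^i)/2$. It assumes, for simplicity,
a constant bias $p$ (the sources considered above have a slowly drifting bias), and the
function names parity_bias and parity_normalise are hypothetical.

```python
def parity_bias(p_one, block_size):
    """P(parity = 1) - 1/2 for the XOR of block_size independent bits.

    Each bit is assumed to equal 1 with the same probability p_one.
    """
    return -0.5 * (1.0 - 2.0 * p_one) ** block_size


def parity_normalise(bits, block_size):
    """Replace each full block by its parity bit; a trailing partial block is dropped."""
    out = []
    for start in range(0, len(bits) - block_size + 1, block_size):
        block = bits[start:start + block_size]
        out.append(str(block.count("1") % 2))
    return "".join(out)
```

With p_one = 0.6 the magnitude of the bias is 0.1 for blocks of size 1 and 0.02 for blocks of
size 2, at the cost of halving the number of output bits.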



4 The infinite case 

The extension of the above results to infinite sequences of bits produced by QRNGs is fairly
straightforward, but forces us to address a few unexpected problems. First, we must extend the
definition of the normalisation function $VN_{n,m}$ to sequences. We define $VN : B^\omega \to B^\omega \cup B^*$
as

\[ VN(\mathbf{x} = x_1 \ldots x_n \ldots) = F(x_1x_2)F(x_3x_4)\cdots F(x_{2\lfloor n/2\rfloor - 1}x_{2\lfloor n/2\rfloor})\cdots . \]

For convenience we also define $VN_n : B^\omega \to \left(\bigcup_{k \le n} B^k\right) \cup \{\lambda\}$ as

\[ VN_n(\mathbf{x}) = F(x_1x_2)F(x_3x_4)\cdots F(x_{2\lfloor n/2\rfloor - 1}x_{2\lfloor n/2\rfloor}) = VN_{n,n}(x_1 \ldots x_n). \]
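
On finite prefixes the definition above can be sketched in a few lines of Python. This is an
illustrative sketch only: bit strings are represented as Python strings over "0" and "1", the
function name von_neumann is hypothetical, and the code corresponds to $VN_n$ applied to a
prefix of length $n$ (an infinite sequence can only be processed prefix by prefix).

```python
def von_neumann(bits):
    """Apply F to consecutive non-overlapping pairs of a finite bit string.

    F(01) = 0, F(10) = 1, and F(00) = F(11) = the empty string.
    A trailing unpaired bit is ignored.
    """
    out = []
    for k in range(0, len(bits) - 1, 2):
        pair = bits[k:k + 2]
        if pair == "01":
            out.append("0")
        elif pair == "10":
            out.append("1")
        # pairs 00 and 11 contribute nothing
    return "".join(out)
```

For instance, von_neumann("0110") gives "01", while von_neumann("0011") gives the empty string.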

Secondly, we introduce the probability space of infinite sequences as in [H]. Let
$A_Q = \{a_1, \ldots, a_Q\}$, $Q \ge 2$, be an alphabet with $Q$ elements. We let
$\mathcal{P} = \{xA_Q^\omega \mid x \in A_Q^*\} \cup \{\emptyset\}$ and
$\mathcal{C}$ be the class of all finite mutually disjoint unions of sets in $\mathcal{P}$; the class $\mathcal{P}$ can be readily
shown to generate a $\sigma$-algebra $\mathcal{M}$. Using Theorem 1.7 from [8], the probabilities on $\mathcal{M}$ are
characterised by the functions $h : A_Q^* \to [0,1]$ satisfying:

1. $h(\lambda) = 1$,

2. $h(x) = h(xa_1) + \cdots + h(xa_Q)$, for all $x \in A_Q^*$.

If $Q = 2$, so $A_2 = B$, and for $x \in B^n$ we take $h(x) = P_n(\{x\})$ with $P_n$ as defined in
Fact 1, then the above conditions are satisfied. This induces our probability measure $\mu_P$ on
$\mathcal{M}$, which satisfies $\mu_P(XB^\omega) = P_n(X)$ for $X \subseteq B^n$. Hence the suitable extension of the finite
case probability space to infinite generated sequences is the space $(B^\omega, \mathcal{M}, \mu_P)$. In the special
case when $p_0 = p_1$ we get the Lebesgue probability $\mu_{P_L}(XB^\omega) = \sum_{x \in X} 2^{-|x|}$.

In general, if $Q \ge 2$, $p_i > 0$ for $i = 1, \ldots, Q$ are reals in [0,1] such that $\sum_{i=1}^{Q} p_i = 1$, we can
take $h_Q(x) = p_1^{N_{a_1}(x)} \cdots p_Q^{N_{a_Q}(x)}$ ($N_{a_i}(x)$ is the number of occurrences of $a_i$ in $x$) to obtain
the probability space $(A_Q^\omega, \mathcal{M}, \mu_{P_Q})$ in which $\mu_{P_Q}(xA_Q^\omega) = h_Q(x)$, for all $x \in A_Q^*$.

The first result notes that there exist sequences $\mathbf{x} \in B^\omega$ such that $VN(\mathbf{x}) \in B^*$. In fact
every string can be produced via von Neumann normalisation from a suitable sequence.

Theorem 26. For every string $y \in B^*$ there exists an uncountable set $R \subset B^\omega$ of $\mu_P$ measure
zero such that for all $\mathbf{x} \in R$, $VN(\mathbf{x}) = y$.

Proof. Let $y = y_1 \ldots y_n \in B^*$ and $D = \{00, 11\}$, the two-bit blocks which are deleted by von
Neumann normalisation, and $y' = f(y_1) \ldots f(y_n)$. Then every sequence $\mathbf{x} \in y'D^\omega$ satisfies
$VN(\mathbf{x}) = VN_{2n}(\mathbf{x})\,VN(x_{2n+1}x_{2n+2}\cdots) = y$ since $VN_{2n}(\mathbf{x}) = VN_{2n,2n}(y') = y$ and for all
$\mathbf{z} \in D^\omega$ we have $VN(\mathbf{z}) = \lambda$. Obviously, the set $R = y'D^\omega$ is uncountable and has $\mu_P$ measure
zero as the set of Borel normal sequences has measure one [S]. □
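
The construction in the proof can be checked on finite prefixes with a short Python sketch.
The names von_neumann and collapsing_prefix are hypothetical, the encoding f(0) = 01,
f(1) = 10 is the inverse of F on single bits, and the pseudo-random padding merely stands in
for an arbitrary element of $D^\omega$ truncated to finitely many blocks.

```python
import random


def von_neumann(bits):
    """F(01) = 0, F(10) = 1, equal pairs are deleted; a trailing single bit is ignored."""
    out = []
    for k in range(0, len(bits) - 1, 2):
        pair = bits[k:k + 2]
        if pair == "01":
            out.append("0")
        elif pair == "10":
            out.append("1")
    return "".join(out)


def collapsing_prefix(y, padding_blocks, seed=0):
    """Finite prefix of a sequence in y' D^omega, where y' = f(y_1)...f(y_n)."""
    rng = random.Random(seed)
    encode = {"0": "01", "1": "10"}  # f, mapping each bit to the pair that F sends to it
    y_prime = "".join(encode[b] for b in y)
    tail = "".join(rng.choice(["00", "11"]) for _ in range(padding_blocks))
    return y_prime + tail
```

However long the padding is made, von_neumann(collapsing_prefix(y, k)) returns y, which is
the finite counterpart of the collapse $VN(\mathbf{x}) = y$.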

Corollary 27. The set $Q = \{\mathbf{x} \in B^\omega \mid VN(\mathbf{x}) \in B^*\}$ has $\mu_P$ measure zero.

Proof. We simply note that the union of countably many measure zero sets also has measure 
zero. □ 

It is interesting to note that the "collapse" in the generated sequence produced by von
Neumann normalisation in Theorem 26 is not due to computability properties of the sequence.
In particular, there are random sequences that collapse to any string, so to strings which are
not Borel normal.

In the following we need a measure-theoretic characterisation of random sequences, so we 
present a few facts from constructive topology and probability. 

Consider the compact topological space $(A_Q^\omega, \tau)$ in which the basic open sets are the sets
$wA_Q^\omega$, with $w \in A_Q^*$. Accordingly, an open set $G \subseteq A_Q^\omega$ is of the form $G = VA_Q^\omega$, where
$V \subseteq A_Q^*$.

From now on we assume that the reals $p_i$, $1 \le i \le Q$, which define the probability $\mu_{P_Q}$
are all computable. A constructively open set $G \subseteq A_Q^\omega$ is an open set $G = VA_Q^\omega$ for which
$V \subseteq A_Q^*$ is computably enumerable (c.e.). A constructive sequence of constructively open sets,
for short, c.s.c.o. sets, is a sequence $(G_m)_{m \ge 1}$ of constructively open sets $G_m = V_mA_Q^\omega$ such
that there exists a c.e. set $X \subseteq A_Q^* \times \mathbb{N}$ with $V_m = \{x \in A_Q^* \mid (x,m) \in X\}$, for all natural
$m \ge 1$. A constructively null set $S \subseteq A_Q^\omega$ is a set for which there exists a c.s.c.o. sets $(G_m)_{m \ge 1}$
with $S \subseteq \bigcap_{m \ge 1} G_m$, $\mu_{P_Q}(G_m) < 2^{-m}$. A sequence $\mathbf{x} \in A_Q^\omega$ is random in the probability space
$(A_Q^\omega, \mathcal{M}, \mu_{P_Q})$ if $\mathbf{x}$ is not contained in any constructively null set in $(A_Q^\omega, \mathcal{M}, \mu_{P_Q})$. For the case
of the Lebesgue probability $\mu_{P_L}$ the measure-theoretic characterisation of random sequences
holds true: $\mathbf{x}$ is random if and only if $\mathbf{x}$ is not contained in any constructively null set of
$(A_Q^\omega, \mathcal{M}, \mu_{P_L})$ [23, 18].

We continue with another instance in which von Neumann normalisation decreases ran- 
domness. 

Proposition 28. There exist (continuously many) infinite 1/2-random sequences $\mathbf{x} \in B^\omega$
such that $VN(\mathbf{x}) = 000 \ldots 00 \ldots$.

Proof. Consider a random sequence $\mathbf{x} = x_1x_2 \cdots x_n \cdots$ and construct the sequence
$\mathbf{x}' = 0x_10x_2 \cdots 0x_n \cdots$. Clearly, $\mathbf{x}'$ is 1/2-random, but $VN(\mathbf{x}') = 000 \ldots 00 \ldots$ because there exist
infinitely many 1's in $\mathbf{x}$. □

In the following examples von Neumann normalisation conserves or increases randomness. 

Proposition 29. There exist (continuously many) infinite 1/2-random sequences $\mathbf{x} \in B^\omega$
such that $VN(\mathbf{x})$ is random.

Proof. Consider a random sequence $\mathbf{x} = x_1x_2 \ldots x_n \ldots$ and construct the sequence
$\mathbf{x}' = x_1\bar{x}_1x_2\bar{x}_2 \cdots x_n\bar{x}_n \cdots$. Clearly, $\mathbf{x}'$ is 1/2-random and $VN(\mathbf{x}') = \mathbf{x}$. □
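
The two constructions used in the proofs of Propositions 28 and 29 can be tried out on finite
prefixes with the Python sketch below. The function names are hypothetical, and a finite
prefix can of course only illustrate the combinatorics of the constructions, not the
randomness claims, which concern infinite sequences.

```python
def von_neumann(bits):
    """F(01) = 0, F(10) = 1, equal pairs are deleted; a trailing single bit is ignored."""
    out = []
    for k in range(0, len(bits) - 1, 2):
        pair = bits[k:k + 2]
        if pair == "01":
            out.append("0")
        elif pair == "10":
            out.append("1")
    return "".join(out)


def interleave_zeros(x):
    """Construction of Proposition 28: x_1 x_2 ... becomes 0 x_1 0 x_2 ..."""
    return "".join("0" + b for b in x)


def pair_with_complement(x):
    """Construction of Proposition 29: each bit is followed by its complement."""
    flip = {"0": "1", "1": "0"}
    return "".join(b + flip[b] for b in x)
```

For a prefix x, von_neumann(interleave_zeros(x)) consists of one 0 for every 1 in x, whereas
von_neumann(pair_with_complement(x)) reproduces x itself.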



Comment. Both Propositions 28 and 29 are true for the more general case of $\varepsilon$-random
sequences, where $0 < \varepsilon < 1$ is computable.

We briefly note that in the definition of Borel normality it does not matter if we count
the number of non-overlapping occurrences of each string of length $m$, $N_i^m(y)$ as defined in
Section 2, or the number of overlapping occurrences, $\mathcal{N}_i^m(y)$ pU].

Theorem 30. Let $\mathbf{x} \in B^\omega$ be Borel normal in $(B^\omega, \mathcal{M}, \mu_{P_L})$. Then $VN(\mathbf{x})$ is also Borel
normal in $(B^\omega, \mathcal{M}, \mu_{P_L})$.

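Proof. Note that $VN(\mathbf{x}) \in B^\omega$ because $\mathbf{x}$ contains infinitely many occurrences of 01 on
even/odd positions. Let $D = \{00, 11\}$, $\mathbf{x}^*(n) = VN_{n,n}(\mathbf{x}(n))$, $n' = |\mathbf{x}^*(n)|$. We have

\[
\lim_{n'\to\infty} \frac{N_i^m(\mathbf{x}^*(n))}{n'}
  = \lim_{n'\to\infty} \left(\frac{n}{n'}\right)\left(\frac{N_i^m(\mathbf{x}^*(n))}{n}\right),
\]

but as $n \to \infty$, $n' \to \infty$. We thus have

\[
\lim_{n'\to\infty} \frac{n'}{n}
  = \lim_{n'\to\infty} \frac{N^2_{01}(\mathbf{x}(n)) + N^2_{10}(\mathbf{x}(n))}{n}
  = 2^{-1}
\]

by the normality of $\mathbf{x}$. The number of occurrences of each $i = i_1 \ldots i_m \in B^m$ in $\mathbf{x}^*(n)$
is the number of occurrences of $i' = f(i_1)y_1f(i_2)\cdots y_{m-1}f(i_m)$ in $\mathbf{x}(n)$, summed over all
$y_1, \ldots, y_{m-1} \in D^*$. Viewing $i'$ as a string over $\{00, 01, 10, 11\}$ we have:

\begin{align*}
\lim_{n'\to\infty} \frac{N_i^m(\mathbf{x}^*(n))}{n}
  &= \lim_{n\to\infty} \frac{\sum_{y_1,\ldots,y_{m-1}} N_{i'}^{|i'|}(\mathbf{x}(n))}{n} \\
  &= \sum_{|y_1|=0}^{\infty} \sum_{|y_2|=0}^{\infty} \cdots \sum_{|y_{m-1}|=0}^{\infty}
     2^{|y_1 \ldots y_{m-1}|}\, 2^{-2|i'|} \\
  &= 2^{-2m} \sum_{|y_1|=0}^{\infty} 2^{-|y_1|} \sum_{|y_2|=0}^{\infty} 2^{-|y_2|} \cdots
     \sum_{|y_{m-1}|=0}^{\infty} 2^{-|y_{m-1}|} \\
  &= 2^{-2m}\, 2^{m-1} \\
  &= 2^{-(m+1)} .
\end{align*}

Hence, both limits exist and we have

\[
\lim_{n'\to\infty} \frac{N_i^m(\mathbf{x}^*(n))}{n'}
  = \lim_{n'\to\infty} \left(\frac{n}{n'}\right)\left(\frac{N_i^m(\mathbf{x}^*(n))}{n}\right)
  = \frac{\lim_{n'\to\infty} N_i^m(\mathbf{x}^*(n))/n}{\lim_{n'\to\infty} n'/n}
  = \frac{2^{-(m+1)}}{2^{-1}}
  = 2^{-m} .
\]

Since this holds for all $m$, $i$ we have that $VN(\mathbf{x})$ is Borel normal. □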

Let $A_Q = \{a_1, \ldots, a_Q\}$, $Q \ge 3$. Let $\sum_{i=1}^{Q} p_i = 1$ where $p_i > 0$ for $i = 1, \ldots, Q$
and $(A_Q^\omega, \mathcal{M}, \mu_{P_Q})$ be the probability space defined by the probabilities $p_i$. Let
$A_{Q-1} = \{a_1, \ldots, a_{Q-1}\}$ and $(A_{Q-1}^\omega, \mathcal{M}, \mu_{P^T_{Q-1}})$ be the probability space defined by the probabilities

\[ p_i^T = p_i \left(1 + \frac{p_Q}{\sum_{j=1}^{Q-1} p_j}\right) = \frac{p_i}{1 - p_Q}, \]

with $1 \le i \le Q-1$. Let $T : A_Q^* \to A_{Q-1}^*$ be the monoid morphism defined by $T(a_i) = a_i$
for $1 \le i \le Q-1$, $T(a_Q) = \lambda$; $T(x) = T(x_1)T(x_2)\cdots T(x_n)$ for $x \in A_Q^n$. As $T$ is
prefix-increasing we naturally extend $T$ to sequences to obtain the function $T : A_Q^\omega \to A_{Q-1}^\omega$ given
by $T(\mathbf{x}) = \lim_{n\to\infty} T(\mathbf{x}(n))$ for $\mathbf{x} \in A_Q^\omega$.

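Lemma 31. The transformation $T$ is $(\mu_{P_Q}, \mu_{P^T_{Q-1}})$-preserving, i.e. for all $w \in A_{Q-1}^*$ we have
$\mu_{P_Q}(T^{-1}(wA_{Q-1}^\omega)) = \mu_{P^T_{Q-1}}(wA_{Q-1}^\omega)$.

Proof. Take $w = w_1 \ldots w_m \in A_{Q-1}^m$. We have:

\begin{align*}
\mu_{P_Q}\left(T^{-1}(wA_{Q-1}^\omega)\right)
  &= \mu_{P_Q}\left(\{\mathbf{x} \in A_Q^\omega \mid w \sqsubset T(\mathbf{x})\}\right) \\
  &= \mu_{P_Q}\left(\{a_Q^{i_1}w_1a_Q^{i_2}w_2 \cdots a_Q^{i_m}w_m\mathbf{z} \mid \mathbf{z} \in A_Q^\omega,\ i_1, \ldots, i_m \ge 0\}\right) \\
  &= \sum_{i_1,\ldots,i_m=0}^{\infty} h_Q\left(a_Q^{i_1}w_1a_Q^{i_2}w_2 \cdots a_Q^{i_m}w_m\right) \\
  &= \sum_{i_1,\ldots,i_m=0}^{\infty} p_Q^{i_1+\cdots+i_m}\, h_Q(w) \\
  &= h_Q(w) \left(\frac{1}{1-p_Q}\right)^m \\
  &= \mu_{P^T_{Q-1}}(wA_{Q-1}^\omega).
\end{align*}

□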
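
The erasing morphism $T$ and the renormalised probabilities $p_i^T$ can be illustrated
numerically. The Python sketch below is only an illustration with hypothetical names
(erase_last_letter, renormalised, preimage_measure): letters are single characters, the last
character of the alphabet plays the role of $a_Q$, and the infinite sum over the run lengths
$i_1, \ldots, i_m$ in the proof of Lemma 31 is truncated at max_run, so the agreement with
$\prod_j p^T_{w_j}$ is approximate.

```python
from itertools import product
from math import prod, isclose


def erase_last_letter(x, alphabet):
    """T on finite strings: delete every occurrence of a_Q, the last letter of alphabet."""
    a_q = alphabet[-1]
    return "".join(c for c in x if c != a_q)


def renormalised(p):
    """p_i^T = p_i / (1 - p_Q) for i = 1..Q-1, given p = [p_1, ..., p_Q] with p_Q < 1."""
    return [p_i / (1.0 - p[-1]) for p_i in p[:-1]]


def preimage_measure(w, alphabet, p, max_run):
    """Truncated sum of h_Q(a_Q^{i_1} w_1 ... a_Q^{i_m} w_m) over 0 <= i_j <= max_run."""
    prob = dict(zip(alphabet, p))
    h_w = prod(prob[c] for c in w)          # h_Q(w), the product of the letter probabilities
    p_q = prob[alphabet[-1]]
    total = 0.0
    for runs in product(range(max_run + 1), repeat=len(w)):
        total += p_q ** sum(runs) * h_w
    return total
```

With alphabet "abc" and p = [0.5, 0.3, 0.2], the renormalised probabilities are 0.625 and
0.375, and preimage_measure("ab", "abc", p, 40) agrees with 0.625 * 0.375 up to the
truncation error.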



Proposition 32. If $\mathbf{x} \in A_Q^\omega$ is random in $(A_Q^\omega, \mathcal{M}, \mu_{P_Q})$ and $T$ is the transformation defined
in Lemma 31 then $T(\mathbf{x})$ is random in $(A_{Q-1}^\omega, \mathcal{M}, \mu_{P^T_{Q-1}})$.

Proof. We generalise a result in [TD] stating that, for the Lebesgue probability,
measure-preserving transformations preserve randomness. Assume that $\mathbf{x}$ is random in $(A_Q^\omega, \mathcal{M}, \mu_{P_Q})$
but $T(\mathbf{x})$ is not random in $(A_{Q-1}^\omega, \mathcal{M}, \mu_{P^T_{Q-1}})$, i.e. there is a constructive null set
$R = (G_m)_{m \ge 1}$ containing $T(\mathbf{x})$. Assume that $G_m = X_mA_{Q-1}^\omega$, where $X_m \subseteq A_{Q-1}^*$ is c.e. and has
the measure $\mu_{P^T_{Q-1}}(X_mA_{Q-1}^\omega)$ smaller than $2^{-m}$. Define $S_m = T^{-1}(X_mA_{Q-1}^\omega) \subseteq A_Q^\omega$ and note
that $S_m$ is open because it is equal to $\bigcup_{w \in X_m} V_wA_Q^\omega$ with $V_w = \{v \in A_Q^* \mid w \sqsubset T(v)\}$ and,
using Lemma 31, has the measure smaller than $2^{-m}$:

\[
\mu_{P_Q}(S_m) = \mu_{P_Q}\left(\bigcup_{w \in X_m} V_wA_Q^\omega\right)
  = \mu_{P^T_{Q-1}}(X_mA_{Q-1}^\omega)
  < 2^{-m} .
\]

We have proved that $\mathbf{x}$ is not random in $(A_Q^\omega, \mathcal{M}, \mu_{P_Q})$, a contradiction. □

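Let us define $VN^{-1} : 2^{B^*} \to 2^{B^*}$ for $x = x_1 \ldots x_m \in B^m$ as

\[
VN^{-1}(x) = \{y \mid y = u_1f(x_1)u_2 \cdots u_mf(x_m)u_{m+1}v \text{ and }
  u_i \in \{00, 11\}^* \text{ for } 1 \le i \le m+1,\ v \in B \cup \{\lambda\}\}
\]

and for $X \subseteq B^*$ as

\[ VN^{-1}(X) = \bigcup_{x \in X} VN^{-1}(x). \]

For all $x \in B^*$ and $\mathbf{y} \in VN^{-1}(x)B^\omega$ we then have $x \sqsubset VN(\mathbf{y})$.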

For the cases that $VN(\mathbf{x}) \in B^\omega$, the probability space $(B^\omega, \mathcal{M}, \mu_{P_{VN}})$ induced by von
Neumann normalisation is endowed with the measure $\mu_{P_{VN}}$. The measure $\mu_{P_{VN}}$ is defined on
the sets $xB^\omega$ with $x \in B^*$ by

\[ \mu_{P_{VN}}(xB^\omega) = \frac{\mu_P(VN^{-1}(x)B^\omega)}{\mu_P(VN^{-1}(B^{|x|})B^\omega)} . \]

By noting that $VN^{-1}(B^{|x|}) \subset VN^{-1}(B^*)$ it is clear that $\mu_{P_{VN}}$ satisfies the Kolmogorov
axioms for a probability measure. While the set $VN^{-1}(B^{|x|})$ contains sequences for which
normalisation produces a finite string, from Corollary 27 we know that the set of such
sequences has measure zero, so the definition of $\mu_{P_{VN}}$ is a good model of the target probability
space.

Scolium 33. Let $\mathbf{x} \in B^\omega$ be random in $(B^\omega, \mathcal{M}, \mu_P)$. Then $VN(\mathbf{x}) \in B^\omega$ is also random in
$(B^\omega, \mathcal{M}, \mu_{P_{VN}})$.

Proof. We write the random sequence $\mathbf{x}$ as
$\mathbf{x} = x_1x_2 \ldots x_n \cdots = (x_1x_2) \cdots (x_{2n-1}x_{2n}) \cdots \in \{00, 01, 10, 11\}^\omega$.
Renaming $a = 00$, $A = 01$, $B = 10$, $b = 11$ and consistently deleting first all
occurrences of $a$ we get a random sequence $\mathbf{x}_{A,B,b}$ on the alphabet $\{A, B, b\}$, then deleting all
occurrences of $b$ we get a random sequence $\mathbf{x}_{A,B}$ on the alphabet $\{A, B\}$. The result follows
from the fact that $VN(\mathbf{x}) = \mathbf{x}_{0,1}$ and Proposition 32 stating that $\mathbf{x}_{A,B}$ is random. □

Corollary 34. If $\mathbf{x} \in B^\omega$ is random in $(B^\omega, \mathcal{M}, \mu_P)$ then $VN(\mathbf{x})$ is Borel normal in
$(B^\omega, \mathcal{M}, \mu_{P_{VN}})$.

Proof. From Scolium 33 it follows that $VN(\mathbf{x})$ is Borel normal provided $\mathbf{x}$ is random [H]. □

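Theorem 35. The probability space $(B^\omega, \mathcal{M}, \mu_{P_{VN}})$ induced by von Neumann normalisation
is the uniform distribution $(B^\omega, \mathcal{M}, \mu_{P_L})$, where $\mu_{P_L}$ is the Lebesgue measure.

Proof. By Lemma 31 von Neumann normalisation is measure preserving, so for $x \in B^*$ we
have

\[
\mu_{P_{VN}}(xB^\omega) = \frac{\mu_P(VN^{-1}(x)B^\omega)}{\mu_P(VN^{-1}(B^{|x|})B^\omega)},
\qquad
\mu_P(VN^{-1}(x)B^\omega) = p_0^{|x|} p_1^{|x|} \sum_{d_1, \ldots, d_{|x|} \in D^*}
   p_0^{\#_0(d_1 \ldots d_{|x|})}\, p_1^{\#_1(d_1 \ldots d_{|x|})} .
\]

The key point, as in the finite case, is that this only depends on $|x|$, not $x$ itself. By using the
fact that for any $n$, $\sum_{x \in B^n} \mu_{P_{VN}}(xB^\omega) = 1$, we have

\[ \mu_{P_{VN}}(xB^\omega) = 2^{-|x|} \]

for all $x \in B^*$, and hence $\mu_{P_{VN}} = \mu_{P_L}$, the Lebesgue measure. □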
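
The key point of the proof, that the probability of an output depends only on its length, can
be checked exactly in the finite setting by brute force. The Python sketch below is an
illustration under stated assumptions: the source bits are independent with a constant
probability p0 of producing 0, the input length n is small enough to enumerate all $2^n$
inputs, and the names von_neumann and output_distribution are hypothetical.

```python
from collections import defaultdict
from itertools import product


def von_neumann(bits):
    """F(01) = 0, F(10) = 1, equal pairs are deleted; a trailing single bit is ignored."""
    out = []
    for k in range(0, len(bits) - 1, 2):
        pair = bits[k:k + 2]
        if pair == "01":
            out.append("0")
        elif pair == "10":
            out.append("1")
    return "".join(out)


def output_distribution(n, p0):
    """Exact distribution of the normalised output over all n-bit inputs.

    Input bits are independent, each equal to 0 with probability p0.
    """
    dist = defaultdict(float)
    for tup in product("01", repeat=n):
        s = "".join(tup)
        prob = p0 ** s.count("0") * (1.0 - p0) ** s.count("1")
        dist[von_neumann(s)] += prob
    return dict(dist)
```

For n = 8 and p0 = 0.7, all outputs of a given length k receive the same probability, so that
conditionally on the output length the distribution is uniform.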

This can easily be extended from the case when $VN(\mathbf{x})$ is infinite, to the case in which
it is finite. To do so, note that if $\mathbf{y} \in B^\omega$ and $VN(\mathbf{y}) = x \in B^n$, then the probability space
induced by von Neumann normalisation is $(B^n, 2^{B^n}, P_n^*)$. We then have

\[ P_n^*(x) = \frac{\mu_P(VN^{-1}(x)D^\omega)}{\mu_P(VN^{-1}(B^n)D^\omega)} \]

and since the denominator is constant for all $x \in B^n$, we can proceed as above, and
$P_n^* = U_n$ as desired.

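Theorem 36. The set $\{\mathbf{x} \in B^\omega \mid VN(\mathbf{x}) \in B^* \text{ or } VN(\mathbf{x}) \in B^\omega \text{ is computable}\}$ has measure
zero with respect to the probability space $(B^\omega, \mathcal{M}, \mu_P)$.

Proof. By Scolium 33 we deduce that

\[
\{\mathbf{x} \in B^\omega \mid VN(\mathbf{x}) \in B^\omega \text{ is computable}\}
  \subset \{\mathbf{x} \in B^\omega \mid \mathbf{x} \text{ is not random in } (B^\omega, \mathcal{M}, \mu_P)\},
\]

which has measure zero [23]. To complete the proof, note that we know from Corollary 27
that the set $\{\mathbf{x} \in B^\omega \mid VN(\mathbf{x}) \in B^*\}$ also has measure zero. □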

5 Role of probability spaces for QRNGs



The treatment of QRNGs as entirely probabilistic devices is grounded purely on the probabilis- 
tic treatment of measurement in quantum mechanics which originated with Born's decision 
to "give up determinism in the world of atoms" [6] , a viewpoint which has become a core part 
of our understanding of quantum mechanics. This is formalised by the Born rule, but the 
probabilistic nature of individual measurement is nonetheless postulated and tells us nothing 
about how the probability arises. Along with the assumption of independence this allows us 
to predict the probability of successive events, as we have done. 

No-go theorems such as the Kochen-Specker Theorem tell us something stronger:
if we assume non-contextuality (i.e. that the result of an observation is independent of which
compatible observables are co-measured alongside it jHHS]) then there can, in general, be no
pre-existing definite values prescribable to certain sets of measurement outcomes in Hilbert
space of dimension three or greater. In other words, the randomness is not due to ignorance of
the system being measured; indeed, since there are in general no definite values associated
with the measured observable it is surprising there is an outcome at all [31]. While this does
not answer the question as to where the randomness arises from, it does tell us something
stronger than the Born Rule does. It has been shown that every infinite sequence produced
by a QRNG is (strongly) incomputable. In particular, this implies that it is impossible for
a QRNG to output a computable sequence. The set of computable numbers has measure
zero with respect to the probability space of the QRNG, but the impossibility of producing such
a sequence is much stronger than, although not in contradiction with, the probabilistic results.

In the finite case every string is, of course, obtainable, and we would expect the distribution 
to be that predicted by the probability space derived from the Born Rule. However, the infinite 
case has something to say here too. We can view any finite string produced by a QRNG as 
the initial segment of an infinite sequence the QRNG would produce if left to run indefinitely. 
For any infinite sequence produced by the QRNG, it is impossible to compute the value of any 
bit before it is measured [1]; in the finite case this means there is no way to provably compute 
the value of the next bit before it is measured. In light of value indefiniteness this is not 
unexpected, but nonetheless gives mathematical grounding to the postulated unpredictability 
of each individual measurement, as well as the independence of successive measurements — 
indeed we can rule out any computable causal link within the system which may give rise to 
the measurement outcome. 

The results we have presented in this paper, however, describe thoroughly the distribution 
of strings/sequences produced by QRNGs. With the distributions known we can create more 
intelligent tests of the quality of output of a QRNG [9]. Current statistical tests for analysing
RNGs are designed with pseudo-RNGs in mind, and are not necessarily the best way to test 
the quality of QRNGs. The effects of normalisation on strings generated by QRNGs can 
help us design QRNGs which are more robust to experimental imperfection and exhibit the 
desired behaviour. It will further aid in developing new normalisation techniques designed 
to produce the expected (ideal) theoretical distribution even in the absence of experimental 
imperfections. 

6 Conclusions



The analysis developed in this paper involves the probability spaces of the source and output 
of a QRNG and the effect von Neumann normalisation has on these spaces. 

In the "ideal case", the von Neumann normalised output of an independent constantly
biased QRNG is the probability space of the uniform distribution (un-biasing). This result is
true both for finite strings and for the infinite sequences produced by QRNGs (the QRNG
runs indefinitely in the second case).

For a real-world QRNG in which the bias, rather than holding steady, drifts slowly, we 
evaluated the speed of drift required to be maintained by the source distribution to guarantee 
that the output distribution is arbitrarily close to the uniform distribution. It is an open 
question to study the more realistic case when, instead of the bits being independent, the 
probability for each bit depends on a finite number of preceding bits (for example, because of 
the high bit-rate of the experiment). 

We have also examined the effect von Neumann normalisation has on various properties of
infinite sequences. In particular, Borel normality and (algorithmic) randomness are invariant
under normalisation, but for $\varepsilon$-random sequences with $0 < \varepsilon < 1$, normalisation can either
decrease or increase the randomness of the source. It is an open question whether von Neumann
normalisation preserves randomness and Borel normality for finite strings.

Finally, we reiterate that a successful application of von Neumann normalisation (in
fact, of any un-biasing transformation) does exactly what it promises, un-biasing, one (among
infinitely many) symptoms of randomness; it will not produce "true" randomness.